An Attribute Reduction Algorithm Based on Granular Computing and Discernibility
JI Su-Qin, SHI Hong-Bo, LÜ Ya-Li
Faculty of Information Management, Shanxi University of Finance and Economics, Taiyuan 030031
Abstract Traditional attribute reduction algorithms load the entire dataset into main memory at once, which makes them poorly suited to big data analysis. To address this problem, an attribute reduction algorithm based on granular computing and discernibility is proposed. The original large-scale dataset is divided into small granules by stratified sampling, and attributes are then reduced on each granule according to the discernibility of the attributes. Finally, the reducts obtained on the individual granules are fused by weighting. Experimental results show that the proposed algorithm is feasible and efficient for attribute reduction on massive datasets.
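The three stages named in the abstract (stratified partitioning, per-granule reduction by discernibility, weighted fusion) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual definitions, so everything concrete here is an assumption made for illustration: the function names, the greedy selection rule (pick the attribute that leaves the fewest undiscerned object pairs), the use of granule size as the fusion weight, and the 0.5 vote threshold. It should be read as one plausible instance of the idea, not as the authors' algorithm.

```python
import random
from collections import Counter, defaultdict


def stratified_granules(rows, k, seed=0):
    """Split rows into k granules, keeping decision-class proportions.

    Each row is (condition_values, decision). Round-robin dealing within
    each class is an assumed, simple form of stratified sampling.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[1]].append(row)
    granules = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for i, row in enumerate(members):
            granules[i % k].append(row)
    return granules


def undiscerned_pairs(rows, attrs):
    """Count object pairs with different decisions that agree on all attrs."""
    blocks = defaultdict(Counter)
    for cond, dec in rows:
        blocks[tuple(cond[a] for a in attrs)][dec] += 1
    total = 0
    for counts in blocks.values():
        n = sum(counts.values())
        # n^2 - sum(c^2) is twice the number of cross-decision pairs.
        total += (n * n - sum(c * c for c in counts.values())) // 2
    return total


def greedy_reduct(rows, n_attrs):
    """Greedily add the attribute that discerns the most remaining pairs."""
    # Pairs that even the full attribute set cannot discern (inconsistency).
    target = undiscerned_pairs(rows, list(range(n_attrs)))
    chosen, remaining = [], list(range(n_attrs))
    while undiscerned_pairs(rows, chosen) > target:
        best = min(remaining,
                   key=lambda a: undiscerned_pairs(rows, chosen + [a]))
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)


def fuse(reducts, sizes, threshold=0.5):
    """Keep attributes whose size-weighted vote reaches the threshold."""
    total = sum(sizes)
    score = Counter()
    for reduct, size in zip(reducts, sizes):
        for a in reduct:
            score[a] += size / total
    return sorted(a for a, s in score.items() if s >= threshold)


def reduce_attributes(rows, n_attrs, k, threshold=0.5):
    """Partition, reduce each granule independently, then fuse."""
    granules = [g for g in stratified_granules(rows, k) if g]
    reducts = [greedy_reduct(g, n_attrs) for g in granules]
    return fuse(reducts, [len(g) for g in granules], threshold)


# Toy decision table: the decision equals attribute 0; attribute 1 is
# constant and attribute 2 is unrelated to the decision.
rows = [((i % 2, 0, i % 3), i % 2) for i in range(12)]
print(reduce_attributes(rows, n_attrs=3, k=3))  # [0]
```

In this sketch each granule is reduced independently, so only one granule needs to be held in memory at a time, which is the motivation stated in the abstract. The fusion threshold controls how much agreement across granules is required before an attribute is kept.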
Received: 26 May 2014